INTRODUCTION TO INEQUALITIES 


VIPUL NAIK 


ABSTRACT. This is a somewhat modified version of the notes I had prepared for a lecture on inequalities 
that formed part of a training camp organized by the Association of Mathematics Teachers of India for 
preparation for the Indian National Mathematical Olympiad (INMO) for students from Tamil Nadu. 


1. BASIC IDEA OF INEQUALITIES 


1.1. What we need to prove. An “inequation” is an expression of the form:

\[ F \ge 0 \]

where $F$ is an expression in terms of certain variables. An “inequality” is an inequation that is
satisfied for all values of the variables (within a certain range).
For instance:

\[ x^2 - x + 1 \ge 0 \]

and

\[ x^2 - x - 1 \ge 0 \]

are both inequations. Among these, the first inequation is true for all real $x$, while the second
inequation is true only for values of $x$ within a certain range.
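A quick way to see the difference between the two examples is to look at the discriminant. The sketch below is only an illustration (the helper name `solution_set` is made up here and is not part of the notes); it assumes a quadratic with leading coefficient 1.

```python
import math

def solution_set(b, c):
    """Describe where x^2 + b*x + c >= 0 holds (leading coefficient 1).

    Hypothetical helper, for illustration only.
    """
    disc = b * b - 4 * c
    if disc <= 0:
        # No two distinct real roots: the parabola never dips below zero.
        return "all real x"
    r1 = (-b - math.sqrt(disc)) / 2
    r2 = (-b + math.sqrt(disc)) / 2
    # Between the two roots the quadratic is negative.
    return f"x <= {r1:.3f} or x >= {r2:.3f}"

print(solution_set(-1, 1))    # x^2 - x + 1
print(solution_set(-1, -1))   # x^2 - x - 1
```

The first call reports that the inequation holds for all real x; the second reports the two half-lines outside the roots of x^2 - x - 1.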
Thus, when we talk of an inequality, we have the following in mind: 


• The underlying inequation
• The range of values over which the inequality is true


A strict inequation is an inequation of the form: 


\[ F > 0 \]

where $F$ is an expression in terms of the variables.
Given any inequation $F \ge 0$ we can consider the corresponding strict inequation $F > 0$.
Thus, when studying an inequality, we are interested in: 


• The underlying inequation
• The range of values over which the inequality is true
• The values for which exact equality holds


Some other points to note: 


• Any inequation of the form $F \ge G$, where $F$ and $G$ are both expressions, can be written in the
standard form as $F - G \ge 0$. The original inequation is true for precisely those values for which
the standard form is true. The equality conditions are also the same.

• An inequation of the form $F \le G$ can be expressed as $G - F \ge 0$. Again, the original inequation
is true for precisely those values for which the standard form is true. The equality conditions
are also the same.


1.2. No square is negative. This basic inequality states:

\[ x^2 \ge 0 \]

The range is all $x \in \mathbb{R}$ and equality holds iff $x = 0$.
This can be generalized to something of the form:

\[ (f(x_1, x_2, \ldots, x_n))^2 + (g(x_1, x_2, \ldots, x_n))^2 \ge 0 \]

The range is all real $x_1, x_2, \ldots, x_n$ and equality holds iff $f(x_1, x_2, \ldots, x_n) = g(x_1, x_2, \ldots, x_n) = 0$.


Problem 1. Prove that $x^4 - x^2y^2 + y^4 \ge 0$ for all real $x$ and $y$, equality holding iff $x = y = 0$.


Proof. We use:

\[ x^4 - x^2y^2 + y^4 = (x^2 - y^2)^2 + (xy)^2 \]

Thus, $x^2 - y^2$ plays the role of $f$ above and $xy$ plays the role of $g$.
Clearly then, the left-hand side is nonnegative, and is 0 if and only if $x^2 = y^2$ and $xy = 0$, thus forcing
$x = y = 0$.
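
The sum-of-squares identity used in this proof can be spot-checked with exact integer arithmetic. This is only a sanity check on a small grid, not a proof; the function names are chosen here for illustration.

```python
import itertools

def lhs(x, y):
    # Left-hand side of Problem 1.
    return x**4 - x**2 * y**2 + y**4

def sum_of_squares(x, y):
    # (x^2 - y^2)^2 + (xy)^2, i.e. f^2 + g^2 with f = x^2 - y^2 and g = xy.
    return (x**2 - y**2) ** 2 + (x * y) ** 2

grid = range(-5, 6)
for x, y in itertools.product(grid, grid):
    assert lhs(x, y) == sum_of_squares(x, y)
    # On this grid, the expression vanishes only at the origin.
    assert (lhs(x, y) == 0) == (x == 0 and y == 0)
```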


We can extend the idea to sums of more than two squares: 


Problem 2. Prove that $a^2 + b^2 + c^2 + ab + bc + ca \ge 0$ with equality holding only if $a = b = c = 0$.


Proof. The left-hand side can be expressed as $\frac{1}{2}(a^2 + b^2 + c^2 + (a+b+c)^2)$. So it is nonnegative and
can be zero only if $a = b = c = 0$.

Alternatively, the left-hand side can also be written as $\frac{1}{2}((a+b)^2 + (b+c)^2 + (c+a)^2)$ and is hence
nonnegative, taking the value 0 if and only if $a = b = c = 0$.
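
Both decompositions in this proof can be checked on integer triples. To stay in exact integer arithmetic, the sketch below compares twice the left-hand side with each sum of squares; it is a numerical spot check under that assumption, not a substitute for the algebra.

```python
import itertools

def lhs(a, b, c):
    # Left-hand side of Problem 2.
    return a*a + b*b + c*c + a*b + b*c + c*a

for a, b, c in itertools.product(range(-4, 5), repeat=3):
    twice = 2 * lhs(a, b, c)
    # First decomposition: a^2 + b^2 + c^2 + (a + b + c)^2.
    assert twice == a*a + b*b + c*c + (a + b + c) ** 2
    # Second decomposition: (a + b)^2 + (b + c)^2 + (c + a)^2.
    assert twice == (a + b) ** 2 + (b + c) ** 2 + (c + a) ** 2
    # Zero only at a = b = c = 0 (on this grid).
    assert (twice == 0) == (a == b == c == 0)
```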


Another problem (for which I’m not writing the solution here): 


Problem 3. Prove that $a^2 + b^2 + c^2 - (ab + bc + ca) \ge 0$ with equality holding only if $a = b = c$.


It turns out that one of the solution techniques for the previous problem can be applied to this one. 


1.3. Manipulating about the inequality symbol. The following results are typically used for
manipulating inequalities:

• We can add two inequalities. The greater side gets added to the greater side, the smaller side
to the smaller side. If either inequality is strict, the resultant inequality is again strict. More
generally, the set of values for which the resultant inequality becomes equality is the intersection
of the corresponding sets for each inequality.

• We can multiply both sides of an inequality by a positive number. In general, however, we cannot
multiply two inequalities.
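
A small numerical illustration of these rules, with values picked here purely as an example: adding two inequalities and scaling by a positive number are safe, while multiplying two inequalities side by side can fail when negative numbers are involved.

```python
# Two true inequalities between real numbers (values chosen for illustration).
a, b = -3, 1    # a < b
c, d = -2, 1    # c < d

assert a < b and c < d
# Adding is always safe: smaller sides together, greater sides together.
assert a + c < b + d
# Multiplying a single inequality by a positive number is safe too.
assert 5 * a < 5 * b
# But multiplying the two inequalities side by side reverses the order here.
print(a * c, b * d)          # 6 1
assert not (a * c < b * d)
```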


2. MEAN INEQUALITIES 


2.1. Definition of means. A mean is a good notion of average for a collection of numbers. A mean of 
n numbers is thus typically a function from n-tuples of reals to reals, such that: 


• If all the members of the tuple are equal, the mean should be equal to all of them. That is, if
$a = a_1 = a_2 = \cdots = a_n$ then the mean of $a_1, a_2, \ldots, a_n$ is $a$.

• The mean is a symmetric function of all the elements of the tuple, that is, if the elements are
permuted, the value of the mean remains unchanged. That is, the mean of $a_1, a_2, \ldots, a_n$ is the
same as the mean of $a_{\sigma(1)}, a_{\sigma(2)}, \ldots, a_{\sigma(n)}$ for any permutation $\sigma$.

• The mean of a collection of positive numbers should be between the smallest number and the
largest number. That is, if $a_1 \le a_2 \le \cdots \le a_n$, the mean lies between $a_1$ and $a_n$.

• The mean is an increasing function in each of the arguments. That is, if $a_i \le a_i'$, then the mean of
$a_1, a_2, \ldots, a_{i-1}, a_i, a_{i+1}, \ldots, a_n$ is less than or equal to the mean of $a_1, a_2, \ldots, a_{i-1}, a_i', a_{i+1}, \ldots, a_n$.


We now define some typical notions of mean: 


Definition. (1) The arithmetic mean of $n$ real numbers $a_1, a_2, a_3, \ldots, a_n$ is defined as:

\[ \frac{a_1 + a_2 + \cdots + a_n}{n} \]

The arithmetic mean is a well-defined notion for any collection of real numbers (positive, negative
or zero).
(2) The geometric mean of $n$ positive real numbers $a_1, a_2, a_3, \ldots, a_n$ is defined as:

\[ (a_1 a_2 \cdots a_n)^{1/n} \]

The geometric mean is defined only for positive numbers.
(3) The quadratic mean or the root-mean-square of $n$ real numbers $a_1, a_2, a_3, \ldots, a_n$ is
defined as:

\[ \sqrt{\frac{a_1^2 + a_2^2 + \cdots + a_n^2}{n}} \]

(4) The harmonic mean of $n$ nonzero real numbers $a_1, a_2, a_3, \ldots, a_n$ is defined as:

\[ \frac{n}{a_1^{-1} + a_2^{-1} + \cdots + a_n^{-1}} \]


For two positive reals $a$ and $b$, these boil down to the formulas:

Name of the mean     Value
Arithmetic mean      $\frac{a+b}{2}$
Geometric mean       $\sqrt{ab}$
Quadratic mean       $\sqrt{\frac{a^2+b^2}{2}}$
Harmonic mean        $\frac{2ab}{a+b}$
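
The two-variable formulas translate directly into code. The sketch below is illustrative only (the function names are chosen here, and the sample values 2 and 8 are arbitrary); it evaluates the four means for one pair of positive reals.

```python
import math

def arithmetic_mean(a, b):
    return (a + b) / 2

def geometric_mean(a, b):
    return math.sqrt(a * b)

def quadratic_mean(a, b):
    return math.sqrt((a * a + b * b) / 2)

def harmonic_mean(a, b):
    return 2 * a * b / (a + b)

a, b = 2.0, 8.0
print(quadratic_mean(a, b), arithmetic_mean(a, b),
      geometric_mean(a, b), harmonic_mean(a, b))
```

For this pair the values come out as roughly 5.83, 5, 4 and 3.2.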


2.2. Inequalities for two variables. 


Claim. For positive reals $a$ and $b$, Q.M. $\ge$ A.M. $\ge$ G.M. $\ge$ H.M.


Proof. We prove Q.M. $\ge$ A.M. The remaining proofs follow along similar lines.
What we would like to show is that, for all reals $a$ and $b$:

\[ \sqrt{\frac{a^2 + b^2}{2}} \ge \frac{a + b}{2} \]

Since the left side is nonnegative, it suffices to show that the square of the left side is greater than or
equal to the square of the right side. That is, we need to show that:

\[ \frac{a^2 + b^2}{2} \ge \frac{(a+b)^2}{4} \]

But the latter rearranges to $(a - b)^2 \ge 0$. This tells us that the inequality is valid for all real $a$ and $b$,
with equality holding iff $a = b \ge 0$ (so, for positive reals, iff $a = b$).


Let’s look at the pattern. The Q.M. is essentially obtained by taking the arithmetic mean of squares
and then taking the square root. The A.M. is obtained by taking the arithmetic mean of first powers and
then taking the first root. The H.M. is obtained by taking the arithmetic mean of inverses and then
taking the inverse. This suggests a general definition:

\[ M_r(a,b) = \left( \frac{a^r + b^r}{2} \right)^{1/r} \]

Then the quadratic mean is $M_2$, the arithmetic mean is $M_1$, and the harmonic mean is $M_{-1}$.
By this definition, $M_0$ does not make sense. But it turns out that, through a suitable limit argument,
we can take $M_0$ as the geometric mean. In that case, we have:

\[ M_2(a,b) \ge M_1(a,b) \ge M_0(a,b) \ge M_{-1}(a,b) \]

We also know that:

\[ 2 > 1 > 0 > -1 \]

Does this suggest something?


2.3. The mean inequalities: an explanation. Let $a$ and $b$ be positive reals. What can we say about
the behaviour of $M_r(a,b)$ as $r$ varies from $-\infty$ to $\infty$? It turns out that as $r \to -\infty$, $M_r$ approaches
$\min\{a,b\}$, and as $r \to \infty$, $M_r \to \max\{a,b\}$. Thus, as $r$ steadily increases, $M_r(a,b)$ steadily goes from
the minimum to the maximum.

The explanation for this can be sought by viewing $r$ as a kind of weighting of $a$ and $b$. The greater
the value of $r$, the greater the dominance of the bigger term, and hence, the closer the mean is to the
bigger term.
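
The behaviour described above can be observed numerically. In the sketch below, `power_mean` is a name chosen here for $M_r(a,b)$, and the case $r = 0$ is taken to be the geometric mean, following the limit convention mentioned earlier; the sample exponents are arbitrary.

```python
import math

def power_mean(a, b, r):
    """M_r(a, b) for positive a, b; r = 0 is taken to be the geometric mean."""
    if r == 0:
        return math.sqrt(a * b)
    return ((a**r + b**r) / 2) ** (1 / r)

a, b = 2.0, 8.0
exponents = [-50, -1, 0, 1, 2, 50]
values = [power_mean(a, b, r) for r in exponents]
for r, m in zip(exponents, values):
    print(f"r = {r:4d}   M_r = {m:.4f}")

# The means should be non-decreasing in r and stay between min and max.
assert all(x <= y for x, y in zip(values, values[1:]))
assert min(a, b) <= values[0] and values[-1] <= max(a, b)
```

For very negative r the value sits just above min{a, b}, and for very large r it sits just below max{a, b}.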


2.4. The mean inequalities for many variables. The same phenomena which we observe for two
variables also generalize to more than two variables. We define:

\[ M_r(a_1, a_2, \ldots, a_n) = \left( \frac{a_1^r + a_2^r + \cdots + a_n^r}{n} \right)^{1/r} \]

Again, as $r \to -\infty$, $M_r$ approaches the minimum of the $a_i$s, and as $r \to \infty$, $M_r$ approaches the
maximum of the $a_i$s.


3. CAUCHY-SCHWARZ INEQUALITY 

3.1. Statement. Let $(a_1, a_2, \ldots, a_n)$ and $(b_1, b_2, \ldots, b_n)$ be two $n$-tuples of real numbers. Then:

\[ \left( \sum_i a_i^2 \right) \left( \sum_i b_i^2 \right) \ge \left( \sum_i a_i b_i \right)^2 \]

with equality holding if and only if one of the tuples is zero or $b_i = \lambda a_i$ for some fixed $\lambda$ independent
of $i$ (that is, the tuple of $b_i$s is a scalar multiple of the tuple of $a_i$s).
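
As a concrete check of the Cauchy-Schwarz statement and its equality cases, the following sketch computes the difference between the two sides for a few integer tuples. The helper name `cauchy_schwarz_gap` and the sample tuples are chosen here for illustration.

```python
def cauchy_schwarz_gap(a, b):
    """Return (sum a_i^2)(sum b_i^2) - (sum a_i b_i)^2, which should be >= 0."""
    sum_a2 = sum(x * x for x in a)
    sum_b2 = sum(y * y for y in b)
    dot = sum(x * y for x, y in zip(a, b))
    return sum_a2 * sum_b2 - dot * dot

print(cauchy_schwarz_gap([1, 2, 3], [4, -5, 6]))   # 14*77 - 12^2 = 934
print(cauchy_schwarz_gap([1, 2, 3], [2, 4, 6]))    # proportional tuples: 0
print(cauchy_schwarz_gap([0, 0, 0], [4, -5, 6]))   # zero tuple: 0
```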


3.2. Vector interpretation. The vector interpretation of the Cauchy-Schwarz inequality looks at both
$a = (a_1, a_2, \ldots, a_n)$ and $b = (b_1, b_2, \ldots, b_n)$ as vectors in $\mathbb{R}^n$. Then, the left-hand side is:

\[ |a|^2 |b|^2 \]

where $|a|$ denotes the magnitude or length of the vector $a$.
The right-hand side is the square of the dot product of the vectors, which is the same as:

\[ (a \cdot b)^2 = |a|^2 |b|^2 \cos^2\theta \]

where $\theta$ is the angle between the vectors. Since $\cos^2\theta \le 1$, with equality holding if and only if $a$ and $b$ are
collinear, we get a geometric proof of the Cauchy-Schwarz inequality.


3.3. A trigonometric problem. Consider the following problem:


Problem 4. Maximize

\[ a\cos\theta + b\sin\theta \]

as a function of $\theta$, where $a$ and $b$ are fixed reals (and not both zero).


The idea is to view this as a dot product of the vectors $(a,b)$ and $(\cos\theta, \sin\theta)$. By the
Cauchy-Schwarz inequality, we have:

\[ (a^2 + b^2)(\cos^2\theta + \sin^2\theta) \ge (a\cos\theta + b\sin\theta)^2 \]

Since $\cos^2\theta + \sin^2\theta = 1$, we obtain:

\[ |a\cos\theta + b\sin\theta| \le \sqrt{a^2 + b^2} \]

A necessary and sufficient condition for the magnitude of the left-hand side to be $\sqrt{a^2 + b^2}$ is that
$a/\cos\theta = b/\sin\theta$, giving $\tan\theta = b/a$. Among the two possible values for the pair $(\cos\theta, \sin\theta)$ we must
pick the one making $a\cos\theta + b\sin\theta$ positive.
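
The bound and the maximizing angle can be checked numerically. The sketch below compares a brute-force maximum over a grid of angles with the value at `math.atan2(b, a)`, which is one way to select the angle whose $(\cos\theta, \sin\theta)$ points along $(a,b)$; the sample values of a and b and the helper name `max_on_grid` are assumptions made for the illustration.

```python
import math

def max_on_grid(a, b, steps=100_000):
    """Largest value of a*cos(t) + b*sin(t) over a fine grid of angles t."""
    return max(a * math.cos(2 * math.pi * k / steps)
               + b * math.sin(2 * math.pi * k / steps)
               for k in range(steps))

a, b = 3.0, -4.0
theta = math.atan2(b, a)          # angle with (cos, sin) proportional to (a, b)
at_theta = a * math.cos(theta) + b * math.sin(theta)

# All three numbers should agree up to small numerical error.
print(max_on_grid(a, b), at_theta, math.hypot(a, b))
```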


3.4. A geometric problem. Consider the following problem: 


Problem 5. Let A and B be two points in a plane at distance 1. Find the maximum length of a path 
from A to B, comprising at most n line segments, with the property that at every stage, the distance 
from B is reducing. 


The answer is $\sqrt{n}$.


Proof. The idea of the proof is to use induction on n. Let f(n) denote the maximum value for a given n. 

We observe that any such optimal path is memoryless in the following sense: 

Suppose $\gamma$ is a path from $A$ to $B$ comprising at most $n$ line segments, and suppose that the first line
segment of $\gamma$ ends at a point $P$. Now, the part from $P$ to $B$ must be composed of at most $(n-1)$ line segments
with the property that at every stage, the distance from $B$ is reducing.

Now, whatever path we choose, we could replace the part from $P$ to $B$ by a path of maximum length from $P$ to $B$
comprising at most $(n-1)$ line segments and with the property that the distance from $B$ is reducing. Since the
original path was longest, we conclude that the part from $P$ to $B$ must also be the longest one.

Now what is the longest possible path of at most $(n-1)$ line segments from $P$ to $B$? Since lengths scale, it
is the length $PB$ times the value $f(n-1)$. We thus get:

\[ \text{length of } \gamma = AP + PB \cdot f(n-1) \]

Thus the maximum of the possible lengths of $\gamma$ is the maximum over all $P$ of the above expression.
Now, from the fact that along the segment $AP$, the distance from $B$ is steadily reducing, we obtain that
the angle $\angle APB$ is either obtuse or right. Thus, in particular, for any given length $AP$, we have:

\[ PB \le \sqrt{1 - AP^2} \]

If equality does not hold, we could replace $P$ by another point $Q$ such that $AQ = AP$ and such that
$\angle AQB = \pi/2$. Then, $QB$ would be greater than $PB$, and hence, the length of the longest path would
increase. Hence, we conclude that equality does indeed hold for the longest path, viz. $\angle APB = \pi/2$.

Let $\theta$ be $\angle BAP$. Then $AP = \cos\theta$ and $PB = \sin\theta$. We thus get:

\[ f(n) = \max_\theta \left( \cos\theta + f(n-1)\sin\theta \right) \]

Thus, applying the result of the previous problem:

\[ f(n) = \sqrt{1 + f(n-1)^2} \]

Since $f(1) = 1$ (clearly) we get $f(n) = \sqrt{n}$.
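
Applying the bound of Problem 4 with $a = 1$ and $b = f(n-1)$ gives the recurrence $f(n) = \sqrt{1 + f(n-1)^2}$ with $f(1) = 1$. The sketch below simply iterates that recurrence and compares the result with $\sqrt{n}$; the function name `longest_path` is chosen here for illustration.

```python
import math

def longest_path(n):
    """f(n) computed from f(1) = 1 and f(k) = sqrt(1 + f(k-1)^2)."""
    f = 1.0
    for _ in range(n - 1):
        f = math.hypot(1.0, f)   # sqrt(1 + f^2)
    return f

for n in (1, 2, 3, 4, 9, 16):
    # The two printed numbers should agree up to rounding.
    print(n, longest_path(n), math.sqrt(n))
```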


4. REARRANGEMENT AND CHEBYSHEV INEQUALITY 


4.1. Rearrangement inequality: statement. Let $(a_1, a_2, \ldots, a_n)$ and $(b_1, b_2, \ldots, b_n)$ be two $n$-tuples
of real numbers such that $a_1 \ge a_2 \ge \cdots \ge a_n$ and $b_1 \ge b_2 \ge \cdots \ge b_n$. Let $\sigma$ be a permutation of the
numbers $1, 2, \ldots, n$. Then:

\[ \sum_i a_i b_i \ge \sum_i a_i b_{\sigma(i)} \]

In other words, the sum of pairwise products is maximum if we pair the largest with the largest, the
second largest with the second largest, and so on.

Equality holds if and only if, for each $i$, $a_i = a_{\sigma(i)}$ or $b_i = b_{\sigma(i)}$.

Further:

\[ \sum_i a_i b_{\sigma(i)} \ge \sum_i a_i b_{n+1-i} \]

In other words, the sum of pairwise products is minimum if we pair the largest with the smallest, the
second largest with the second smallest, and so on.
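
For small $n$ both halves of the statement can be checked by brute force over all permutations. The tuples below are arbitrary non-increasing examples, and `paired_sum` is a helper name introduced only for this sketch.

```python
from itertools import permutations

def paired_sum(a, b, sigma):
    """Sum of a[i] * b[sigma[i]] for a permutation sigma of the indices."""
    return sum(a[i] * b[sigma[i]] for i in range(len(a)))

a = [9, 7, 4, 1]      # non-increasing
b = [8, 5, 5, 2]      # non-increasing
n = len(a)

sums = [paired_sum(a, b, s) for s in permutations(range(n))]
same_order = paired_sum(a, b, list(range(n)))
opposite_order = paired_sum(a, b, list(range(n - 1, -1, -1)))

# Largest-with-largest gives the maximum, largest-with-smallest the minimum.
assert same_order == max(sums)
assert opposite_order == min(sums)
print(opposite_order, same_order)   # 81 129
```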


4.2. Idea behind the inequality. Think of it as a resource allocation problem. For instance, suppose 
a thief has 3 bags and 3 kinds of coins (gold, silver, copper) to pack in them, and she must pack a 
different kind of coin in each bag. Assume further that the coins are available in unlimited quantities. 
Then, in order to maximize her loot, she will put the gold coins in the biggest bag, the silver coins in 
the second biggest bag, and the copper coins in the third biggest bag. 

The idea is: send the most to the best. Such an allocation principle is often called a greedy allocation 
principle. 

The rearrangement inequality is best proved for two elements, and then extended by induction. Let
$a_1 \ge a_2$ and $b_1 \ge b_2$. Then we have:

\[ (a_1 - a_2)(b_1 - b_2) \ge 0 \]

Manipulating this gives us:

\[ a_1b_1 + a_2b_2 \ge a_1b_2 + a_2b_1 \]

The rearrangement inequality thus illustrates the general statement that the principles of optimization and
equality are often at crossroads.

To use this to prove the result globally, we start with the expression $\sum_i a_i b_{\sigma(i)}$ and locate indices $i, j$
for which $i < j$ but $\sigma(i) > \sigma(j)$. We then change the permutation to one sending $i$ to $\sigma(j)$ and $j$ to $\sigma(i)$
(and having the same effect as $\sigma$ on the others). This local change does not decrease the value of the expression,
so repeating it until no such pair remains shows that the identity permutation attains the optimum value.

Note here that the value stays the same only if $a_i = a_j$ or $b_{\sigma(i)} = b_{\sigma(j)}$.


4.3. An application of rearrangement. Consider the following problem I had mentioned earlier:


Prove that $a^2 + b^2 + c^2 - (ab + bc + ca) \ge 0$ with equality holding only if $a = b = c$.
This problem can also be solved using the rearrangement inequality. First observe that since the
expression is symmetric in $a$, $b$ and $c$, we can assume without loss of generality that $a \ge b \ge c$.
Consider the triple $(a,b,c)$. This is an ordered triple with the property that the elements are in
non-increasing order. Then $(b,c,a)$ is a permutation of this triple. Thus, by the rearrangement inequality:

\[ a \cdot a + b \cdot b + c \cdot c \ge a \cdot b + b \cdot c + c \cdot a \]

which gives us what we want.
Also note that in this case, equality holds if and only if $a = b = c$.


4.4. Chebyshev inequality. The Chebyshev inequality says that sending the most to the best is better
than giving the average to the average. More formally, if $(a_1, a_2, \ldots, a_n)$ and $(b_1, b_2, \ldots, b_n)$ are two
$n$-tuples of reals, each arranged in decreasing order, then:

\[ \sum_i a_i b_i \ge \frac{\left( \sum_i a_i \right) \left( \sum_i b_i \right)}{n} \]

where equality holds iff either all the $a_i$s are equal or all the $b_i$s are equal.
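
The Chebyshev inequality compares $\sum_i a_i b_i$ with $(\sum_i a_i)(\sum_i b_i)/n$ for two similarly ordered tuples. A small exact check, with sample tuples and a helper name (`chebyshev_sides`) chosen only for this sketch, looks as follows.

```python
from fractions import Fraction

def chebyshev_sides(a, b):
    """Return (sum a_i b_i, (sum a_i)(sum b_i)/n) as exact fractions."""
    n = len(a)
    lhs = sum(Fraction(x) * y for x, y in zip(a, b))
    rhs = Fraction(sum(a)) * sum(b) / n
    return lhs, rhs

# Both tuples in decreasing order: the first entry should be the larger one.
print(chebyshev_sides([9, 7, 4, 1], [8, 5, 5, 2]))
# One tuple constant: the two sides coincide.
print(chebyshev_sides([9, 7, 4, 1], [3, 3, 3, 3]))
```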


4.5. Fundamental difference between Chebyshev and Cauchy-Schwarz. The Chebyshev
and the Cauchy-Schwarz inequalities are similar in the following sense:


• They are both true for all reals
• They both provide bounds on $\sum_i a_i b_i$


But they are different in the following ways:


• In Chebyshev, it is important to order the $a_i$s and $b_i$s in descending order, whereas
Cauchy-Schwarz is applicable for any ordering

• Chebyshev gives a bound in terms of $\sum_i a_i$ and $\sum_i b_i$ while Cauchy-Schwarz gives a bound in
terms of the sums of their squares.

• Chebyshev provides a lower bound on $\sum_i a_i b_i$ while Cauchy-Schwarz provides an upper bound

• The equality case is different in both. In Chebyshev, equality holds if all the elements in one of
the tuples are equal. In Cauchy-Schwarz, equality holds if the two tuples are scalar multiples of
one another.


A word of caution, though, when deciding whether to apply Chebyshev or Cauchy-Schwarz. Just
because the inequality seems to require a lower bound on $\sum_i F_i G_i$ does not mean that Chebyshev is the
one to be used. In fact, we could still use Cauchy-Schwarz by taking $a_i = F_i G_i$ and $b_i$ to be $1/F_i$.


5. NESBITT’S INEQUALITY 


5.1. Statement of the inequality. 


Problem 6 (Nesbitt’s inequality). For positive $a$, $b$ and $c$, prove that:

\[ \frac{a}{b+c} + \frac{b}{c+a} + \frac{c}{a+b} \ge \frac{3}{2} \]

with equality holding if and only if $a = b = c$.


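5.2. Applying Cauchy-Schwarz (direct application fails). To apply Cauchy-Schwarz we need to
put the terms $\frac{a}{b+c}$ and its analogues on the left side, which means we should view each of them as a
square. Their square roots are $\sqrt{\frac{a}{b+c}}$ and its analogues. Thus, one tuple is:

\[ \left( \sqrt{\frac{a}{b+c}}, \sqrt{\frac{b}{c+a}}, \sqrt{\frac{c}{a+b}} \right) \]

We would like the other tuple to be something that cancels the denominator. A natural choice is
$(\sqrt{b+c}, \sqrt{c+a}, \sqrt{a+b})$. Unfortunately, this fails to yield the answer, because the lower bound that we
get, $\frac{(\sqrt{a} + \sqrt{b} + \sqrt{c})^2}{2(a+b+c)}$, is itself at most $\frac{3}{2}$ (with equality only when $a = b = c$), so it is
upper-bounded, rather than lower-bounded, by the value in the case of equality.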


5.3. Applying Chebyshev. Consider the tuples $(a,b,c)$ and $((b+c)^{-1}, (c+a)^{-1}, (a+b)^{-1})$. We first
need to determine whether they are arranged in the same order. Assume without loss of generality that
$a \ge b \ge c$. Then $b+c \le c+a \le a+b$, and taking inverses, we obtain that the second tuple also has its
coordinates in descending order.

We are thus in a position to apply Chebyshev’s inequality and obtain that the given expression is at least:

\[ \frac{(a+b+c)\left( (b+c)^{-1} + (c+a)^{-1} + (a+b)^{-1} \right)}{3} \]

Now using the A.M.-H.M. inequality for the quantities $(b+c)$, $(c+a)$ and $(a+b)$, we get the required
result.
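
The chain of bounds in this argument (Nesbitt sum, then the Chebyshev bound, then 3/2 via A.M.-H.M.) can be followed with exact rational arithmetic. The sample triples and the helper name `nesbitt_chain` below are assumptions made for the illustration.

```python
from fractions import Fraction

def nesbitt_chain(a, b, c):
    """Return (Nesbitt sum, Chebyshev bound) for positive a, b, c."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    inverses = 1 / (b + c) + 1 / (c + a) + 1 / (a + b)
    nesbitt = a / (b + c) + b / (c + a) + c / (a + b)
    chebyshev = (a + b + c) * inverses / 3
    return nesbitt, chebyshev

for triple in [(1, 1, 1), (3, 2, 1), (10, 1, 1)]:
    nesbitt, chebyshev = nesbitt_chain(*triple)
    # Nesbitt sum >= Chebyshev bound >= 3/2 (the last step is A.M.-H.M.).
    assert nesbitt >= chebyshev >= Fraction(3, 2)
    print(triple, nesbitt, chebyshev)
```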


5.4. A short proof. Another way of proving the result is to add and subtract 3, thus writing it as:

\[ (a+b+c)\left( \frac{1}{b+c} + \frac{1}{c+a} + \frac{1}{a+b} \right) - 3 \]

And now apply the A.M.-H.M. inequality.


6. A PAST IMO PROBLEM


6.1. The problem statement.


Problem 7 (IMO 1995). Prove that if $a$, $b$ and $c$ are positive reals such that $abc = 1$, then:

\[ \frac{1}{a^3(b+c)} + \frac{1}{b^3(c+a)} + \frac{1}{c^3(a+b)} \ge \frac{3}{2} \]

The first trick is to put $x = 1/a$, $y = 1/b$ and $z = 1/c$. The left-hand side becomes:

\[ \frac{x^2}{y+z} + \frac{y^2}{z+x} + \frac{z^2}{x+y} \]


6.2. Cauchy-Schwarz. After this point, the first possibility to consider is Cauchy-Schwarz. Since we
want to lower-bound the sum here, we must view $\frac{x^2}{y+z}$ and its analogues as squares of a tuple. The other
tuple is obtained by cancelling denominators from this tuple. We thus have tuples:

\[ \left( \frac{x}{\sqrt{y+z}}, \frac{y}{\sqrt{z+x}}, \frac{z}{\sqrt{x+y}} \right) \]

and

\[ \left( \sqrt{y+z}, \sqrt{z+x}, \sqrt{x+y} \right) \]

We apply Cauchy-Schwarz to these tuples, and then use the A.M.-G.M. inequality and the fact that
$xyz = 1$.

If we keep track of the equality conditions at each step, we obtain that equality holds if and only
if $x = y = z = 1$, and hence $a = b = c = 1$.
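
The steps of this argument can be spot-checked on random inputs: the substitution $x = 1/a$, $y = 1/b$, $z = 1/c$ turns the left-hand side into $\frac{x^2}{y+z} + \frac{y^2}{z+x} + \frac{z^2}{x+y}$, Cauchy-Schwarz bounds that below by $(x+y+z)/2$, and A.M.-G.M. with $xyz = 1$ gives $x + y + z \ge 3$. The sketch below is a floating-point sanity check under these assumptions (function names, sampling range and tolerances are chosen here), not a proof.

```python
import random

def imo_lhs(a, b, c):
    return 1 / (a**3 * (b + c)) + 1 / (b**3 * (c + a)) + 1 / (c**3 * (a + b))

def substituted(x, y, z):
    return x * x / (y + z) + y * y / (z + x) + z * z / (x + y)

random.seed(0)
for _ in range(1000):
    a = random.uniform(0.2, 5.0)
    b = random.uniform(0.2, 5.0)
    c = 1 / (a * b)                      # enforce abc = 1
    x, y, z = 1 / a, 1 / b, 1 / c
    value = imo_lhs(a, b, c)
    # Substitution step: both forms agree up to rounding.
    assert abs(value - substituted(x, y, z)) <= 1e-9 * value
    # Cauchy-Schwarz step: the sum is at least (x + y + z) / 2.
    assert value >= (x + y + z) / 2 - 1e-9 * value
    # A.M.-G.M. step with xyz = 1: x + y + z >= 3.
    assert (x + y + z) / 2 >= 1.5 - 1e-12
```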